Skip to content

Fix isCommand check and set min rows threshold for saveToFile#6377

Closed
turboFei wants to merge 5 commits intoapache:masterfrom
turboFei:check_is_DQL
Closed

Fix isCommand check and set min rows threshold for saveToFile#6377
turboFei wants to merge 5 commits intoapache:masterfrom
turboFei:check_is_DQL

Conversation

@turboFei
Copy link
Member

@turboFei turboFei commented May 9, 2024

🔍 Description

Issue References 🔗

This pull request fixes #
I found that, with saveToFile enabled with the default min size threshold, even I run a simple set command, It also save the result to file.
image

I think we need to skip this kind of queries.

Describe Your Solution 🔧

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Types of changes 🔖

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request ⚰️

Behavior With This Pull Request 🎉

Related Unit Tests


Checklist 📝

Be nice. Be informative.

@turboFei
Copy link
Member Author

turboFei commented May 9, 2024

cc @cxzl25 @wForget @pan3793

@turboFei turboFei requested review from cxzl25, pan3793 and wForget May 9, 2024 04:33
@turboFei turboFei changed the title Skip the save result to file for non data query language Skip the save result to file for non data query language and mini result count query May 9, 2024
@github-actions github-actions bot added the kind:documentation Documentation is a feature! label May 9, 2024
@codecov-commenter
Copy link

codecov-commenter commented May 9, 2024

Codecov Report

Attention: Patch coverage is 29.62963% with 19 lines in your changes are missing coverage. Please review.

Project coverage is 58.48%. Comparing base (12c5568) to head (da9c2a9).

Files Patch % Lines
...g/apache/spark/sql/kyuubi/SparkDatasetHelper.scala 5.88% 16 Missing ⚠️
...uubi/engine/spark/operation/ExecuteStatement.scala 50.00% 2 Missing ⚠️
...in/scala/org/apache/kyuubi/config/KyuubiConf.scala 83.33% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #6377      +/-   ##
============================================
- Coverage     58.51%   58.48%   -0.04%     
  Complexity       24       24              
============================================
  Files           653      653              
  Lines         39895    39916      +21     
  Branches       5482     5485       +3     
============================================
- Hits          23345    23344       -1     
- Misses        14055    14073      +18     
- Partials       2495     2499       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@turboFei turboFei changed the title Skip the save result to file for non data query language and mini result count query Fix isCommand check and set min rows threshold for saveToFile May 9, 2024
@turboFei
Copy link
Member Author

turboFei commented May 9, 2024

thanks @wForget , updated

Copy link
Member

@wForget wForget left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


private def isCommandExec(nodeName: String): Boolean = {
def isCommandExec(result: DataFrame): Boolean = {
val nodeName = result.queryExecution.executedPlan.getClass.getName
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result.queryExecution.executedPlan.nodeName?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spark 3.1

scala> spark.sql("set").queryExecution.executedPlan.nodeName
res22: String = Execute SetCommand

scala> spark.sql("show tables").queryExecution.executedPlan.nodeName
res23: String = Execute ShowTablesCommand

scala>

spark 3.5

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is a bug and not covered by UT before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have the method isCommandResultExec before, until we drop spark 3.1 support

private def isCommandResultExec(sparkPlan: SparkPlan): Boolean = {
// scalastyle:off line.size.limit
// the CommandResultExec was introduced in SPARK-35378 (Spark 3.2), after SPARK-35378 the
// physical plan of runnable command is CommandResultExec.
// for instance:
// ```
// scala> spark.sql("show tables").queryExecution.executedPlan
// res0: org.apache.spark.sql.execution.SparkPlan =
// CommandResult <empty>, [namespace#0, tableName#1, isTemporary#2]
// +- ShowTables [namespace#0, tableName#1, isTemporary#2], V2SessionCatalog(spark_catalog), [default]
//
// scala > spark.sql("show tables").queryExecution.executedPlan.getClass
// res1: Class[_ <: org.apache.spark.sql.execution.SparkPlan] = class org.apache.spark.sql.execution.CommandResultExec
// ```
// scalastyle:on line.size.limit
sparkPlan.getClass.getName == "org.apache.spark.sql.execution.CommandResultExec"
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have dropped support for Spark 3.1 in #6273, do we still need to adapt to it?

Copy link
Member Author

@turboFei turboFei May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it did not work for spark 3.5 as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it did not work for spark 3.5 as well

I mean we can simplify the logical to

result.queryExecution.executedPlan match {
    case commandResult: CommandResultExec => true
    case _ => false
}

@turboFei turboFei self-assigned this May 9, 2024
@turboFei turboFei added this to the v1.9.1 milestone May 9, 2024
@turboFei turboFei closed this in 3439ea0 May 9, 2024
turboFei added a commit that referenced this pull request May 9, 2024
…eToFile

# 🔍 Description
## Issue References 🔗

This pull request fixes #
I found that, with saveToFile enabled with the default min size threshold, even I run a simple `set` command, It also save the result to file.
<img width="1718" alt="image" src="https://github.com/apache/kyuubi/assets/6757692/5bcc0da1-201a-453a-8568-d1bfadd7adef">

I think we need to skip this kind of queries.

## Describe Your Solution 🔧

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6377 from turboFei/check_is_DQL.

Closes #6377

da9c2a9 [Wang, Fei] ut
04e20db [Wang, Fei] conf
8f20ed8 [Wang, Fei] refine the check
f558dcc [Wang, Fei] ut
c813403 [Wang, Fei] DQL

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
(cherry picked from commit 3439ea0)
Signed-off-by: Wang, Fei <fwang12@ebay.com>
@turboFei turboFei deleted the check_is_DQL branch May 9, 2024 15:43
@turboFei
Copy link
Member Author

turboFei commented May 9, 2024

thanks, merged to 1.9.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants